Seeing Stars from Reviews by a Semantic-based Approach with MapReduce Implementation

Authors

  • Pengfei Liu
  • Xiaojun Qian
  • Helen Meng
Abstract

This study addresses aspect-level opinion (sentiment) mining from online reviews. The problem consists of two fundamental sub-tasks: aspect extraction (identifying specific aspects of a product from reviews) and aspect rating estimation (assigning a numerical rating to each aspect). Solving it is useful for many applications, e.g., providing aspect-level review summaries that help consumers make better decisions and giving product manufacturers summarized user feedback. Our objective is to propose a semantic-based approach for aspect-level opinion mining from massive amounts of reviews in a scalable fashion. The MapReduce implementation of this approach substantially reduces runtime compared with the single-process implementation. Experimental results show that the runtime reduction is almost linear in the number of mappers, e.g., around a 7.4-fold reduction with 10 mappers on the TripAdvisor dataset and a 2.6-fold reduction with 4 mappers on the Yelp dataset. The number of mappers and reducers can be configured on demand to handle very large datasets. Moreover, the semantic-based approach performs well for aspect rating estimation on the TripAdvisor dataset, with a mean absolute error (MAE) of around 1.0 on all aspects, meaning that the average deviation between the human rating and the estimated rating is around 1 star. The source code of our implementation of the semantic-based approach can be downloaded from https://github.com/ppfliu/aspect-opinion.
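To make the two reported figures concrete, the following is a minimal Python sketch, not the authors' code. It computes MAE over a small set of hypothetical ratings (invented purely for illustration) and converts the speedups quoted in the abstract into parallel efficiency, i.e., the fraction of ideal linear speedup achieved. The function names are assumptions introduced here.

```python
def mean_absolute_error(human, estimated):
    """Average absolute gap between paired human and estimated ratings."""
    if len(human) != len(estimated) or not human:
        raise ValueError("rating lists must be non-empty and of equal length")
    return sum(abs(h - e) for h, e in zip(human, estimated)) / len(human)


def parallel_efficiency(speedup, mappers):
    """Fraction of ideal linear speedup achieved (1.0 would be perfectly linear)."""
    return speedup / mappers


# Hypothetical star ratings for one aspect (illustrative only, not from the paper).
human_ratings = [5, 4, 2, 3]
estimated_ratings = [4, 4, 3, 5]
mae = mean_absolute_error(human_ratings, estimated_ratings)

# Speedups as quoted in the abstract: 7.4x with 10 mappers, 2.6x with 4 mappers.
tripadvisor_eff = parallel_efficiency(7.4, 10)
yelp_eff = parallel_efficiency(2.6, 4)

print(f"MAE: {mae:.2f}")
print(f"TripAdvisor efficiency: {tripadvisor_eff:.2f}")
print(f"Yelp efficiency: {yelp_eff:.2f}")
```

Under these assumptions, an MAE of 1.0 corresponds to estimates that are off by one star on average, and the quoted speedups correspond to roughly 74% and 65% of ideal linear scaling.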

Similar resources

Use of Semantic Similarity and Web Usage Mining to Alleviate the Drawbacks of User-Based Collaborative Filtering Recommender Systems

One of the most famous methods for recommendation is user-based Collaborative Filtering (CF). This system compares the active user's item ratings with the historical rating records of other users to find similar users, and recommends items that seem interesting to these similar users and have not been rated by the active user. As a way of computing recommendations, the ultimate goal of the user-ba...

An Effective and Efficient MapReduce Algorithm for Computing BFS-Based Traversals of Large-Scale RDF Graphs

Nowadays, a leading instance of big data is represented by Web data that lead to the definition of so-called big Web data. Indeed, extending beyond to a large number of critical applications (e.g., Web advertisement), these data expose several characteristics that clearly adhere to the well-known 3V properties (i.e., volume, velocity, variety). Resource Description Framework (RDF) is a signific...

Scalable Distributed Reasoning Using MapReduce

We address the problem of scalable distributed reasoning, proposing a technique for materialising the closure of an RDF graph based on MapReduce. We have implemented our approach on top of Hadoop and deployed it on a compute cluster of up to 64 commodity machines. We show that a naive implementation on top of MapReduce is straightforward but performs badly and we present several non-trivial opt...

Towards an Ontology-Based Semantic Approach to Tuning Parameters to Improve Hadoop Application Performance

Hadoop MapReduce assists companies and researchers to deal with processing large volumes of data. Hadoop has a lot of configuration parameters that must be tuned in order to obtain a better application performance. However, the best tuning of the parameters is not easily obtained by inexperienced users. Therefore, it is necessary to create environments that promote and motivate information shar...

A Fast Algorithm for Covering Rectangular Orthogonal Polygons with a Minimum Number of r-Stars

This paper presents an algorithm for covering orthogonal polygons with a minimal number of guards. The approach examines the minimum number of guards for simple orthogonal polygons (without holes) in all scenarios and can also find a rectangular area for each guard. We consider the problem of covering orthogonal polygons with a minimum number of r-stars. In each orthogonal polygon P,...

Publication date: 2014